fix: prevent spurious response headers when streaming thinking-model tokens#8
Merged
morganlinton merged 2 commits intoJun 29, 2026
Merged
Conversation
When a thinking model (qwen3, deepseek-r1) is used via Ollama, the
OpenAI-compat streaming API sends per-token chunks that may include an
empty content string alongside each reasoning token.
Before this change two bugs combined to create a 'new response header
per reasoning word' effect in the TUI:
1. agent.rs emitted AgentEvent::Text for every content delta, including
empty strings. render_text('') would open a fresh answer panel
(printing a new 'response ──' header) even though there was nothing
to show.
2. renderer.rs called end_answer() unconditionally on every Reasoning
event. This closed the just-opened (empty) answer panel, leaving
answer_wrap = None so the next empty Text delta would open yet
another panel. With hundreds of thinking tokens this produced
hundreds of spurious headers even with reasoning display turned off.
Fixes:
- agent.rs: skip AgentEvent::Text emission when content delta is empty.
- renderer.rs: only call end_answer() in the Reasoning handler when
reasoning display is on (otherwise the panel state is unaffected).
- renderer.rs: guard render_text() with an early return on empty delta
as a belt-and-suspenders safety against future empty deltas.
Tested with qwen3:8b and qwen2.5:7b via Ollama 0.30.8.
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Adjusts rendering/event emission to handle “thinking model” streams that include empty content deltas alongside reasoning, preventing per-token answer block resets and header spam.
Changes:
- Only close the current answer block on
Reasoningevents when reasoning display is enabled. - Ignore empty text deltas in the renderer and avoid emitting empty
Textevents in the agent. - Add targeted regression tests covering empty text deltas and reasoning toggle behavior.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/renderer.rs | Prevents empty text deltas and (when disabled) reasoning tokens from disrupting the answer block; adds regression tests. |
| src/agent.rs | Stops emitting AgentEvent::Text for empty content deltas to reduce downstream rendering churn. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Comment on lines
+378
to
382
| if self.display.reasoning { | ||
| self.end_answer(); | ||
| } | ||
| self.render_reasoning(&delta) | ||
| } |
bennewell35
reviewed
Jun 26, 2026
bennewell35
left a comment
Contributor
There was a problem hiding this comment.
I reviewed the diff and ran the upstream-style checks in a disposable Docker Rust environment. The behavior tests pass, but the PR currently fails the required clippy gate.
Validation commands:
docker run --rm -v /tmp/smallharness-pr8:/work -w /work rust:latest bash -c 'rustup component add rustfmt clippy >/dev/null && rustc --version && cargo --version && cargo fmt --all -- --check && cargo clippy --all-targets -- -D warnings && cargo test'\n```\n\nResult:\n- `rustc 1.96.0` / `cargo 1.96.0`\n- `cargo fmt --all -- --check` passed\n- `cargo clippy --all-targets -- -D warnings` failed\n- Separate `cargo test` passed: 383 passed, 0 failed\n\nClippy failure:\n\n```text\nerror: field assignment outside of initializer for an instance created with Default::default()\n --> src/renderer.rs:820:9\n |\n820 | cfg.reasoning = false;\n\nerror: field assignment outside of initializer for an instance created with Default::default()\n --> src/renderer.rs:826:9\n |\n826 | cfg.reasoning = true;\n```\n\nMinimal fix should be to initialize those helpers directly, e.g.\n\n```rust\nfn renderer_reasoning_off() -> TuiRenderer {\n TuiRenderer::new(DisplayConfig {\n reasoning: false,\n ..Default::default()\n })\n}\n\nfn renderer_reasoning_on() -> TuiRenderer {\n TuiRenderer::new(DisplayConfig {\n reasoning: true,\n ..Default::default()\n })\n}\n```\n\nI did not live-test qwen3/deepseek-r1 via Ollama, so this is code/test validation rather than independent model-runtime validation. The patch looks narrow and the added renderer tests cover the stated state transitions, but it needs the clippy fix before it can pass the documented project gate.…ests Initialize DisplayConfig with a struct literal + ..Default::default() instead of mutating fields after default(), so the repo's `cargo clippy --all-targets -- -D warnings` gate passes in CI. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
When using a thinking/reasoning model (qwen3, deepseek-r1, etc.) via Ollama, the interactive TUI renders a new
response ──header for every individual reasoning token — producing hundreds of blank panels and making the session unusable.Root cause: Two independent bugs that compound each other:
agent.rs— The streaming loop emitsAgentEvent::Textfor everycontentdelta, including the empty strings ("") that Ollama sends alongside eachreasoningtoken during the thinking phase.render_text("")opens a fresh answer panel (printing a new header) even though there is nothing to render.renderer.rs— TheReasoningevent handler callsend_answer()unconditionally, closing whichever answer panel was just opened. This leavesanswer_wrap = None, so the next emptyTextdelta opens yet another panel. With hundreds of thinking tokens, this cycle produces hundreds of spurious headers — even whendisplay.reasoningisfalse.Fix
Three minimal changes:
agent.rs(1 line): SkipAgentEvent::Textemission when the content delta is empty.renderer.rs(3 lines): Only callend_answer()in theReasoninghandler whendisplay.reasoningistrue; when reasoning display is off, the answer panel should be unaffected by reasoning tokens.renderer.rs(3 lines): Guardrender_text()with an early return on empty delta as defence-in-depth against future empty deltas from any backend.Tests
Three unit tests added to
renderer.rs:empty_text_delta_does_not_open_answer_block— emptyTextevent leavesanswer_wrapasNonereasoning_event_preserves_answer_block_when_reasoning_display_off—Reasoningevent does not close an in-progress answer whenreasoning: falsereasoning_event_closes_answer_block_when_reasoning_display_on—Reasoningevent correctly closes the answer block whenreasoning: true(existing intended behaviour)All 383 existing tests continue to pass.
Verified with